Database management system

ABSTRACT

A database system configures a storage model in accordance with a hierarchical tree-like structure to enable fast and comprehensive data extraction functions. A plurality of entities, attributes and entity occurrences are each assigned a unique, multi-character expression. The expression has a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence with every other entity, attribute and entity occurrence. The expressions are stored in an expression set table linking each element of each expression with a natural language phrase or data definition. Events are recorded in an entity history table each event having an associated expression. Data is extracted from the database according to a multi-character query expression comprising characters that are deterministic to the query and characters that are non-deterministic to the query. Data extracted from the database is also filtered according to a plurality of multi-character profile expressions each comprising characters that are deterministic and characters that are non-deterministic and which together define filtration criteria.

The present invention relates to database systems, and in particular to a database system which configures a data model in accordance with a hierarchical tree-like structure which enables fast and comprehensive data extraction, querying and output display functions.

There are presently very many ways of constructing and maintaining database structures on computer systems. As is well known, the relational database is widely used. In a relational database, every entity in a data model has a number of attributes which may be accorded values selected either from discrete sets of values, or from within ranges of continuously variable values. All entity occurrences having the same attribute types are stored in a relation table, with each entity occurrence occupying a row, or tuple, of the table having a field or element corresponding to each attribute. Each field of the row contains an alphanumeric value for the relevant attribute value. Separate tables are provided for different entities each having a different set of attributes.

The data model, or representation of the relationships between the different entities, is provided both implicitly by the incidence of common attributes between the various relation tables, and also by imposing conditions on various attributes such as their identification as key fields.

In extracting data from the database, a query is formulated, in a suitable programming language, which instructs the data processing system to scan selected attribute columns of specified tables for adherence to certain conditions, and to output, usually into an output table, the data in preselected attribute columns for each tuple or row of the scanned table or tables. The output table can then be browsed by the user on screen, or printed out.

A number of disadvantages present themselves with this technique. Queries must be formulated using particular query languages which must be learnt by the users. Although these are commonly interfaced with a “natural language” interface making their use easier for the non-expert user, certain rules and protocols must be understood.

A further disadvantage is that the queries are quite specific, and do not generally permit what we shall call “progressive browsing”: that is to say, once a query has been formulated, the resulting output table is produced, and the information contained therein is fixed and limited to the scope of the original query. Further scanning of the output table is possible by formulating a further query to reduce the size of the output by imposing additional limitations on the ranges of values that an attribute may take, for example, but generally, for browsing through the database, a new query must be formulated each time to scan the appropriate parts of the database. In general (except where the selected attribute is the index field), in re-scanning the output table(s) to answer a “sub-query”, the whole of the table or tables must be searched for adherence to the new selection criteria.

In processing a query, it is normally necessary to perform quite complex manipulations on the various tables involved in the query, which include joining or merging operations, and the temporary creation of intermediate tables to be used as the operands for subsequent parts of the query. Such operations naturally involve considerable processing power and time to carry out.

A further disadvantage is that the relational database must generally be designed and constructed to conform to the data model representing the organisation of interest. This is typically performed by a skilled analyst, and is not particularly flexible once set up.

Relational databases also provide for the generation of user-specific views of the extracted data. For example, classes of different users may be permitted to view or access contents of only certain tables, or certain portions of tables in the database. This is typically implemented by providing an access control mechanism that prevents access to, or display of results from, predetermined tables, tuples of tables, and/or columns of tables according to the user identity implementing the query. This may be effected by an access specification that is used by the system when generating the query output to determine which tables may be accessed by a particular user or class of user.

An innovative database management system that offers considerable benefits over the relational database systems referred to above has been described in GB 2293667B, relevant parts of which are reproduced in this specification.

The present invention is particularly concerned with techniques for improving the functionality of the database system described in GB '667, particularly in respect of user-specific functions and providing enhanced output options.

According to one aspect, the invention provides a method of operating a database system comprising the steps of:

-   assigning to each of a plurality of entities, attributes and entity occurrences, a unique, multi-character expression, the expression having a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence;
-   storing said expressions in an expression set table linking each element of each expression with a data definition relating the expression to a hierarchical level and a position in a data model;
-   recording events in an entity history table, each event having associated therewith a relevant expression from the expression set table;
-   extracting records from the database according to a multi-character query expression comprising characters which are deterministic to the query and characters which are not deterministic to the query;
-   filtering the extracted records according to a plurality of multi-character profile expressions each comprising characters that are deterministic and characters that are non-deterministic and which together define filtration criteria;
-   outputting only extracted records that meet the filtration criteria and match the query expression.
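Purely by way of illustration, the extracting and filtering steps above may be sketched as follows. The sketch assumes that expressions are equal-length character strings, that the non-deterministic character is the wild card symbol “#” described later in this specification, and that a record meets the filtration criteria when it matches at least one profile expression. The names `matches` and `extract_and_filter`, the record layout and the shortened five-character sample expressions are hypothetical and are not taken from the described embodiment.

```python
WILDCARD = "#"  # assumed non-deterministic character


def matches(pattern: str, expression: str) -> bool:
    """Return True if every deterministic character of pattern equals
    the character at the same position in expression."""
    if len(pattern) != len(expression):
        return False
    return all(p == WILDCARD or p == e for p, e in zip(pattern, expression))


def extract_and_filter(history, query, profiles):
    """Extract records matching the query, then keep those that also
    match at least one profile expression (an assumed combination rule)."""
    extracted = [rec for rec in history if matches(query, rec["expression"])]
    return [
        rec for rec in extracted
        if any(matches(profile, rec["expression"]) for profile in profiles)
    ]


# Hypothetical entity history records with shortened expressions.
history = [
    {"entity_id": 1, "expression": "CBBCF"},
    {"entity_id": 2, "expression": "CLBCF"},
    {"entity_id": 3, "expression": "CBXYZ"},
]
result = extract_and_filter(history, query="C#BCF", profiles=["CB###"])
print([rec["entity_id"] for rec in result])
```

Because matching is purely positional, the comparison needs no joins or intermediate tables; other rules for combining several profile expressions could be substituted for the `any` test.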

According to another aspect, the present invention provides a method of operating a database system comprising the steps of:

-   assigning to each of a plurality of entities, attributes and entity occurrences, a unique, multi-character expression, the expression having a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence;
-   storing said expressions in an expression set table linking each element of each expression with a data definition relating the expression to a hierarchical level and a position in a data model;
-   recording events in an entity history table, each event having associated therewith a relevant expression from the expression set table;
-   extracting records from the database according to a Boolean combination of (i) a multi-character query expression comprising characters which are deterministic to the query and characters which are not deterministic to the query and (ii) a plurality of multi-character profile expressions, each profile expression comprising characters that are deterministic to predetermined filtration criteria and characters that are non-deterministic to predetermined filtration criteria, by selecting records in which every deterministic character of the Boolean combination matches a corresponding deterministic character in said expressions in the database; and
-   outputting said extracted records.

According to another aspect, the present invention provides database apparatus comprising:

-   means for storing, for each of a plurality of entities, attributes and entity occurrences a unique, multi-character expression, the expression having a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence;
-   means for storing said expressions in an expression set table linking each element of each expression with a data definition relating the expression to a hierarchical level and a position in a data model;
-   means for storing, in an entity history table, a plurality of recorded events, each event having associated therewith a relevant expression from the expression set table;
-   a query processor for extracting records from the database according to a multi-character query expression comprising characters that are deterministic to the query and characters that are not deterministic to the query;
-   a profile processor for filtering the extracted records according to a plurality of multi-character profile expressions each comprising characters that are deterministic and characters that are non-deterministic and which together define filtration criteria; and
-   output means for generating an output of all extracted records that match the filtration criteria.

According to another aspect, the present invention provides a method of operating a database system comprising the steps of:

-   assigning to each of a plurality of entities, attributes and entity occurrences, a unique, multi-character expression, the expression having a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence;
-   storing said expressions in an expression set table linking each element of each expression with a data definition relating the expression to a hierarchical level and a position in a data model;
-   recording events in an entity history table, each event having an event time and a relevant expression from the expression set table associated therewith; and
-   extracting records from the entity history table according to a multi-character query expression comprising characters that are deterministic to the query and characters that are not deterministic to the query, and according to a predetermined time window function.
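As a hedged illustration of the last step above, extraction by query expression and time window might be sketched as below. The form of the time window function is not fixed by the passage above, so the sketch simply assumes a half-open interval from `start` to `end`; the function names, the record layout and the shortened sample expressions are likewise hypothetical.

```python
from datetime import datetime

WILDCARD = "#"  # assumed non-deterministic character


def matches(pattern, expression):
    """Positional match in which wild card positions are ignored."""
    return len(pattern) == len(expression) and all(
        p == WILDCARD or p == e for p, e in zip(pattern, expression)
    )


def extract_in_window(history, query, start, end):
    """Keep events matching the query whose time lies in [start, end)."""
    return [
        event for event in history
        if start <= event["time"] < end and matches(query, event["expression"])
    ]


# Hypothetical entity history records, each with an event time.
history = [
    {"time": datetime(2001, 1, 5), "expression": "CBBCF"},
    {"time": datetime(2001, 2, 9), "expression": "CBBCF"},
    {"time": datetime(2001, 1, 20), "expression": "CLBCF"},
]
january = extract_in_window(
    history, "CB###", datetime(2001, 1, 1), datetime(2001, 2, 1)
)
print(len(january))
```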

The present invention will now be described by way of example, and with reference to the accompanying drawings in which:

FIG. 1 shows an exemplary data model useful in describing the present invention;

FIG. 2 shows a root expression and extension expression in accordance with those used in the present invention;

FIG. 3 shows a symbolic portion of a data model useful in explaining aspects of the present invention;

FIG. 4 shows a pair of expressions to illustrate context switch links therebetween;

FIG. 5 shows a plurality of table structures and their inter-relationship which can be used in the implementation of the present invention;

FIGS. 6a to 6d show portions of exemplary expression set tables;

FIG. 7 shows corresponding portions of an expression set table or sub-tables showing differing data definitions relating to different classes of user;

FIG. 8 shows corresponding portions of an expression set table or sub-tables, in which data definitions relating to different classes of user do not have a one-to-one correspondence;

FIG. 9 shows a profile processor according to one aspect of the present invention; and

FIGS. 10 to 12 show graphically, events that may be recorded in an entity history table and the chronology thereof.

THE DATA MODEL

In the present invention, the physical model, ie. the storage model which represents the physical structure of the data stored on the computer system, is designed to be much closer to a conceptual model of the real world, ie. the data model of the organisation(s) using the database. This closeness is normally difficult to achieve, simply because the requirements of the computer-accessed disks and other storage media are so different from the human view of the organisational structure being represented by the database. A database implementation which can simplify the interface between the physical model and the conceptual model offers huge advantages in terms of the speed of processing when accessing information from the database, and also greatly simplifies the software and hardware interface necessary to achieve this interface.

In one embodiment, every entity, every attribute and every occurrence of every entity in the data model is uniquely specified by a multi-character “expression” which may conveniently (for the sake of clarity of explanation) be divided into a number of “words”. As illustrated hereinafter, the “expression” may comprise three five-byte words, with each byte representing one ASCII character selected from a set of approximately 200. The number of “words”, however, is not critical to the invention and merely imposes a convenient semantic structure to the expressions as they relate to the data model.

It will be understood that the number of bytes representing a character in the expression, or the length of the overall expression, can be varied according to the requirements of a particular system. In a presently preferred embodiment, the multi-character expression is formed from twenty two-byte characters or “elements”, so that each element may represent any one of 65536 possible different characters.

The expressions do more than simply provide a unique label for each entity, each attribute and each occurrence of each entity: they also implicitly encode the data model by reference to its hierarchical structure and protocol. This is achieved by use of the strict hierarchical protocol in the assignment of expressions to each entity. This can be achieved automatically by the database management system when the user is initially setting up the database, or preferably is imposed by a higher authority to enable the database structure to conform to wider standards, thereby ensuring compatibility with other users of similar database systems.

The way in which the database structure is imposed by the assignment of these expressions is best described with reference to an exemplary data model as shown in FIG. 1.

The tree structure in FIG. 1 represents the “known universe” of the data model. Each hierarchical level of the data model is shown horizontally across the tree structure, and each one of these hierarchical levels may be represented by an appropriate byte I₁ to I₁₅ of the expression shown vertically on the left hand side of the drawing. At the highest level of the tree, I₁, we have context information defining the organisation using the data, for example the National Health Service, Prison Service, Local Authority, Educational Establishment etc.

The significance of byte I₂ will be discussed later, but broadly speaking it indicates a data type from a plurality of possible data types that might be used. Within each organisation (eg. the Health Service) there may typically be a number of departments or functions or data view types (represented by byte I₃) such as administration, finance/accounts and clinical staff, all of whom have different data requirements. These different data requirements encompass:

-   a) different data structures or models reflecting different organisational hierarchies within the department;
-   b) different views of the same entities and occurrences of entities; and
-   c) the same or different views of “standard format” data relating to different occurrences of similar or identical entities or attributes.

The significance of this to the present invention will become clear as one progresses downward through the hierarchy.

Each department may wish to segregate activities (eg. for the purpose of data collection and analysis) to various regional parts of the organisation: eg. a geographically administered area or a sub-department. This can be reflected by expression byte I₄. Each geographically administered area may further be characterized by a number of individual unit types, such as: (i) hospitals, health centres etc. in the case of an NHS application; (ii) schools or higher education institutions in the case of an education application; (iii) prisons and remand centres in the case of the prison service application.

Each of the organisations and units above will have different data structure requirements (as in (a) above) reflecting different entities, attributes and entity relationships within the organisation and these are provided for by suitable allocation of codes within the I₆ to I₁₀ range of expression bytes. In this case, the same alphanumeric codes in bytes I₆ to I₁₀ will have different meaning when in a branch of the tree under NHS than when under, eg. the education branch, even though they exist at the same hierarchical level. As an example, the sub-tree structure represented by particular values of bytes I₆ to I₁₀ may refer to patient treatment records in the NHS context, whereas those values of codes may refer to pupil academic records in the education context.

However, in the case of (b) above, where the organisational unit requires the same or different views of the same entities, attributes and occurrences of entities as other organisational units, the codes in bytes I₆ to I₁₀ of one branch of the tree will represent the same underlying structure and have the same meaning as corresponding byte values under another branch of the tree. An example of this is where both the administration departments and the finance departments require a view of the personal details of the staff in the hospital, both doctors and nurses. Note that the views of the data may be the same or different for each department, because the view specification is inferred from the higher level I₁ to I₅ fields. In this case, as will be explained later, for entities, attributes and occurrences of entities which are the same in each sub-branch, some or all of the codes I₁₁ to I₁₅ which identify each entity occurrence will have identical values.

In the case of (c) above, ie. the same or different views of standard format data relating to different occurrences of similar or identical entities and their attributes, it will be understood that a number of predefined bytes require the same specification regardless of the particular organisation using them. For example, a sub-tree relating to personnel records, and including a standard format data structure for recording personnel names, addresses, National Insurance numbers, sex, date of birth, nationality etc. can be replicated for each branch of the tree in which it is required. For example, all of the organisations in the tree will probably require such an employee data sub-tree, and thus by use of standardised codes in bytes I₆ to I₁₀ such organisational sub-trees are effectively copied into different parts of the tree. However, in this case, the context information in fields I₁ to I₅ will indicate that within each organisation, we are actually dealing with different occurrences of similar format data.

The tree structure defined by the expressions I₁ to I₁₅ can be used to define not only all entity types, all entity attribute types and all entity occurrences, but can also be used to encode the actual attribute values of each entity occurrence where such values are limited to a discrete number of possible values. For example, in the sub-tree relating to treatments in the NHS hospital context, “drug” is an entity which has a relation with or is an attribute of, for example: doctors (from the point of view of treatments prescribed); patients (from the point of view of treatments given); administration (from the point of view of maintaining stocks of drugs) and so on. The entire set of drugs used can be provided for with an expression to identify each drug. In an illustrative embodiment, the parts of the expression specific to the occurrences of each drug will be located in the I₁₁ to I₁₅ fields as shown in FIG. 1. Thus when used in conjunction with the appropriate fields I₁ to I₁₀, it will be apparent whether the specified drug is in the context of a treatment prescribed by a doctor, a treatment received by a patient, or a stock to be held in the hospital pharmacy.

Further bytes in the expression, lower in the hierarchy, can be associated with the drug to describe, for example, quantities or standard prescription types. It will be apparent whether the expression refers to a prescribed quantity or a stock quantity by reference to the context information found higher in the hierarchy. In practice, the number of discrete values allowed for each of these grouped “entity values” using the five fields I₁₁ to I₁₅ is approximately 200⁵ = 3.2×10¹¹. As will be described later, the number of permutations allowed can actually be expanded indefinitely, but in practice this has not been found to be necessary. It is noted, however, that the described model of FIG. 1 merely illustrates a principle of the data model. In a presently preferred embodiment, twenty-character expressions are used and the semantic significance of specific fields therein (I₁ to I₂₀) may differ significantly from those presently described in connection with FIG. 1. For example, in a presently preferred model, “entity values” now occupy each of the two-byte elements I₁₃ to I₂₀, thereby allowing 65536⁸ discrete values (= 3.4×10³⁸).
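The two capacity figures quoted above can be checked with a few lines of arithmetic. This is only a verification sketch; the variable names are illustrative.

```python
# Five fields, each drawn from a set of approximately 200 characters.
five_field_values = 200 ** 5

# Eight two-byte elements, each with 65536 possible characters.
eight_element_values = 65536 ** 8

print(f"{five_field_values:.1e}")
print(f"{eight_element_values:.1e}")
```

The first figure is exactly 320,000,000,000 and the second is 2¹²⁸, which agree with the rounded values of 3.2×10¹¹ and 3.4×10³⁸ given in the text.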

Thus, in the fifteen character expression I₁ to I₁₅, each character represents a natural language expression (eg. English language expression) defining some aspect of the data model, and by travelling downward through the table it is possible to compose a collection of natural language expressions which represents the complete specification of an entity, an attribute or an entity occurrence.
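A minimal sketch of composing such a collection of natural language expressions is given below. It assumes, purely for illustration, that the expression set table can be represented as a mapping from each expression prefix to a phrase, so that the meaning of a character depends on the characters above it in the hierarchy. The table contents, the codes “C”, “P” and “A”, and the function name `describe` are hypothetical.

```python
# Hypothetical fragment of an expression set table: each prefix of an
# expression is linked to a natural language phrase for that level.
EXPRESSION_SET = {
    "C": "Health Service",
    "CP": "presentation",
    "CPA": "administration",
}


def describe(expression: str, table=EXPRESSION_SET) -> list:
    """Collect the phrase for each successive prefix of the expression,
    stopping at the first prefix the table does not define."""
    phrases = []
    for depth in range(1, len(expression) + 1):
        phrase = table.get(expression[:depth])
        if phrase is None:
            break
        phrases.append(phrase)
    return phrases


print(" / ".join(describe("CPA")))
```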

Implementation

For the following detailed description of an implementation of the present invention, we shall use the following data modelling scheme, although it will be understood that the method of the present invention applies to a wide variety of data model designs. In this data model, all information may be regarded as being about either a “presentation”, or an “activity”. A presentation is some thing or notion which is presented to the user. An activity is some thing or notion which is initiated by the user.

However, in further embodiments, other classifications of information may be specified as desired. This does not affect the operating principles of the database. For example, in a presently preferred arrangement, a third category of information, “diagnosis”, has been implemented, in the use of the invention in a medical context. In one context of medical use, the expression set has been adapted to conform to existing internationally recognised professional standards for description of mental diseases, ICD 10. In another context, the expression set has been adapted to conform to a standard diagnosis set called DSM IV from the American Psychiatric Association.

For example, the tree structure of FIG. 1 is, largely, a presentation to the user about the hierarchical structure of the organisation in which the user is working, and the lists of patients, doctors, nurses, school pupils and prisoners are all presentations of things in existence and their relationship with that data structure.

Activities may be regarded as events which the user records or initiates which affect the database, ie. treating a patient, updating a person's records, ordering further supplies of drugs etc. An activity is often initiated in response to a presentation: that is to say, the doctor may view the relevant records of the database presented to him in the form of patient treatments, and prescribe further treatment based thereon.

Information about presentations and activities can only be recorded when associated with an object. In other words, the “patient treatment” per se has only abstract meaning until it is linked with a particular patient, doctor, hospital etc.

The exemplary database structure will be described with reference to five activities which take place in the creation and maintenance of the database.

“Registration” we describe as the recording of static information about an object's existence, which is all presentation information. The registration process itself can, however, be regarded as an activity. This process is embodied in the steps of constructing the tree structure of FIG. 1, and recording information about each entity at the “bottom” layer of the tree: ie. identifying drug no. 1 as “aspirin”.

“Profiling” we describe as recording information about the object's condition, which information is likely to change, and may therefore be regarded as dynamic. The distinction between static and dynamic information is not a rigid one: static information can also be subject to change, and dynamic information might not actually change. An example of dynamic information might be the assignation of a patient to a certain doctor for a certain type of treatment. Compare this with static information represented by the recording of the existence of the patient in the data model by giving the patient a unique identifying number. Loosely speaking, the registration of static information is the identification of each entity occurrence at the bottom of the tree structure of FIG. 1 (eg. DrugNo1, DrugNo2, . . . DoctorID1, DoctorID2, . . . PatientID1, PatientID2 . . . ) and the profiling activity corresponds to defining an entity occurrence's (PatientID1) relationship with the tree (eg. associating the patient with HospitalNo1 or DoctorID1).

“Response”, or “planned response”, is regarded as recording information about responses to an object's condition, ie. updating patient records with treatment details etc.

“Event logging” is regarded as recording information about a sequence of events associated with responses to an object's condition. This is activity and presentation information. For example, it is necessary to ensure that the history of a patient within the hospital can be tracked over a period of time, with details of all treatments and referrals indicated.

“Reporting” allows the user of the database to query the system to extract specified information therefrom.

To carry out all of these activities, a database engine uses the expression set introduced above. As previously discussed in general terms, a full expression consists of, in a presently described embodiment, fifteen elements, divided into three groups of five elements, also known as words. The first word, I₁ to I₅, is reserved for context specification, ie. whose view the expression reflects, the sense of the expression and the domain of the expression. We shall call this the context word. The second word, I₆ to I₁₀, is reserved for specifying a particular procedure, entity or event in the context specified by the context word. We shall call this the specification word. The third word, I₁₁ to I₁₅, is reserved for specifying qualitative information regarding the procedure, entity or event. We shall call this the qualitative word.
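The division of a fifteen-element expression into its three words can be sketched as follows. The sketch assumes an expression held as a fifteen-character string; the type name `Words`, the function name `split_expression` and the sample value are hypothetical.

```python
from typing import NamedTuple


class Words(NamedTuple):
    context: str        # elements 1 to 5
    specification: str  # elements 6 to 10
    qualitative: str    # elements 11 to 15


def split_expression(expression: str) -> Words:
    """Split a fifteen-element expression into its three words."""
    if len(expression) != 15:
        raise ValueError("expected a fifteen-element expression")
    return Words(expression[0:5], expression[5:10], expression[10:15])


words = split_expression("ABCDEFGHIJKLMNO")
print(words.context, words.specification, words.qualitative)
```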

Thus, in addition to defining the upper layers of the tree structure shown in and described with reference to FIG. 1, the context word can also specify the sense of the data identified by the expression. For example, there may be codes embedded into the context word which indicate that we are dealing with presentation information, activity information, or response information, etc. In the illustrated embodiment of FIG. 1, this “sense of the data” is indicated by the value of byte I₂.

The expression can thus be a complete description of a situation: a place, an associated event and a frequency of occurrence; or perhaps a person, an associated action and some measure of quality of that action.

This is in contrast to other coding systems which are usually of an atomic nature. In atomic coding systems, a particular code will describe a particular feature, action or state. To fully describe a situation, then, an arbitrary number of codes must be grouped together in some way. For example, one code for a place, another for an event, perhaps another to complete the event description and then a qualifier of some sort. There is nothing in the codes to indicate the relationship these codes have with one another.

The expressions used in accordance with the present invention have a grammar. Each expression indicates a context, a specification and a quality. If any of these components are unknown or irrelevant, then by default, the expression indicates that this is so by the use of “wild card” characters (which we shall generally refer to as non-deterministic characters).

We also note, at this stage, that although the embodiment described here uses an expression which is fifteen characters in length, to represent a tree structure that is fifteen layers deep, different length expressions or even extension trees are possible. The expression may include a unique code in the third word bytes I₁₁ to I₁₅ which indicates that the word represents a pointer to a further expression. For example, with reference to FIG. 2 there is shown an exemplary extension expression. Root expression 200 contains the three five-character words, with the final five characters representing the link to extension expression 201. The third word 202 includes a special designated character (shown as “X”) in position I₁₁ which indicates that the following four characters in bytes I₁₂ to I₁₅ represent a pointer label to the extension expression 201. The pointer label is replicated in the first word 203 of the extension expression. Thus the presence of the “X” in byte I₁ identifies the status of the expression as an extension expression. In the extension expression, the characters in bytes I₆ to I₁₅ represent a further sub-tree appended to the main tree of FIG. 1. It will be understood that extension expressions may be used to greatly increase the number of entity occurrences, attributes or ranges of possible values of attributes over that which is provided by the root expressions.
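By way of a hedged sketch only, following the link from a root expression to its extension expression might look like this. It assumes that the first word of the extension expression repeats the designated character and the four-character pointer label exactly as they appear in the third word of the root expression; the function name `find_extension` and all sample values are hypothetical.

```python
EXTENSION_MARK = "X"  # the designated character shown as "X" in FIG. 2


def find_extension(root: str, expressions):
    """Return the extension expression linked from root, or None."""
    third_word = root[10:15]
    if not third_word.startswith(EXTENSION_MARK):
        return None  # the root's third word is not a pointer
    for candidate in expressions:
        # Assumed: the extension's first word repeats the mark and label.
        if candidate[:5] == third_word:
            return candidate
    return None


root = "ABCDE" "FGHIJ" "X0001"
extension = "X0001" "KLMNO" "PQRST"
print(find_extension(root, [extension, "X0002KLMNOPQRST"]))
```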

A blank element in an expression is used to indicate that there is no further detail in an expression. Thus every element to the right of a blank element must also be blank. This can be understood by recognising that a branch of the tree in FIG. 1 cannot exist unless it is connected to the root via branches hierarchically above it.
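The rule that every element to the right of a blank element must also be blank lends itself to a simple check. The sketch below assumes that a blank element is stored as a space character; that encoding and the function name `is_well_formed` are assumptions made for illustration.

```python
BLANK = " "  # assumed encoding of a blank element


def is_well_formed(expression: str) -> bool:
    """True if no non-blank element appears to the right of a blank."""
    first_blank = expression.find(BLANK)
    if first_blank == -1:
        return True  # fully specified expression
    return set(expression[first_blank:]) == {BLANK}


print(is_well_formed("CBB  "), is_well_formed("CB B "))
```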

Where there is no specification at any position in the expression, this is indicated by the wild card symbol “#”.

A feature of the use of expressions to describe the data model is thatsimilar data structures, or sub-trees, are replicated throughout themain tree by using similar expression patterns. For example, withreference to FIG. 3, sub-trees 301, 302 have the same structure, andsub-trees 303, 304 have the same structure.

There are three paths down through the hierarchical tree of FIG. 3. Withtraditional hierarchical browsing systems, the user would explore theirway down the tree to the extremities. This is also the case with thepresent invention. However, the use of the expression sets also providesthe ability to jump to other similar places in the expression set. Thiscan be done in all hierarchical systems by back-tracking until a branchis reached where an alternative route is decided upon. Then the userwould explore this new route until the required branch is reached. Withreference to FIG. 3, at least six steps are required to get from pathCBBCF to CLBCF. Each step presents the opportunity of making a wrongdecision that would delay the finding of the correct data. The use ofexpressions allows one step jumping from sub-tree 301, or 303 tosub-tree 302, or 304 which would otherwise need many back and forwardtracking steps to be made.

This is achieved by the positional integrity, or the “place value” of the characters within the expressions. It can be seen that by changing the B in the second position to an L, the correct expression is arrived at. The lower level elements remain the same which, through the rules of positional integrity, means that the detail description is identical but we may now be talking about a member of staff in the NHS or a member of staff in the prison service.
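The one-step jump can be illustrated with a short Python sketch. It is a sketch only, assuming expressions are plain strings and positions are counted from one (I₁, I₂, ...); the function name is hypothetical.

```python
def jump(expr: str, position: int, new_char: str) -> str:
    """Replace the character at the given 1-based position, leaving all
    other positions (and hence their place values) unchanged."""
    if not 1 <= position <= len(expr):
        raise ValueError("position outside the expression")
    if len(new_char) != 1:
        raise ValueError("exactly one replacement character is required")
    return expr[:position - 1] + new_char + expr[position:]

# The path of FIG. 3: change the B in the second position to an L.
target = jump("CBBCF", 2, "L")
```

The lower order characters “BCF” are untouched, so the detail description is the same in the new context.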

A further application of this feature is that the data model can be arranged to permit data type context changes to be made by changing perhaps only one higher order digit in the expression. For example, if the high order character I₂ is chosen to represent the context of the data model, eg. “presentation” or “response”, then, as shown in FIG. 4 for example, whilst diagnosing a particular disorder at a detailed level, by changing the value of one high order element in the specification word, the user can be left in perhaps the response region of expression codes for this particular disorder. In the example of FIG. 4, this is illustrated by changing the second order element in the context word from “G” to “V”.

An overview of the use of an expression set together with the implementing tables which comprise an illustrative embodiment of the database system of the present invention is now described with reference to FIG. 5.

Every occurrence of an entity about which information must be stored is recorded in the entity details table 510. Each occurrence of each entity is given a unique identifier 512 which is assigned to that entity occurrence, and information about the entity is stored as a value expression information string 513. Examples of value expressions are the character strings giving names, street addresses, town, county, country etc, or drug name, manufacturer's product code etc. These details are essentially alphanumeric strings which themselves contain no further useful hierarchical information and are treated solely as character strings. As will become apparent later, the decision as to which occurrence values are handled at this level is determined by the user's requirements. For example, an address may be recorded entirely as character strings having no further hierarchical significance. Alternatively, the county or city field, or postcode portion of an address might usefully be encoded into an expression in order that rapid searching and sorting of, for example, geographical distribution of patients becomes possible.

Entering this information may be regarded as a registration activity, in that static information about an object's existence is being recorded in the database.

Attributes which may only take permitted discrete values from a set of possible values may be effectively recorded in the expression I₁ to I₁₅ associated therewith as will be described later.

The unique identifier 512 of each entity occurrence in the entity details table 510 provides a link to an entity history table 520 where entry of, or update to the entity occurrence status is stored. In this table, the event updating the database is given a date and/or time 524, an expression 526, and the unique identifier 522 to which the record pertains, and may include other information such as the user ID 527 of the person making the change.

This activity is “profiling”: in other words recording information about the entity and its relationship with the data model. An example of this is assigning to PatientID1 (from the entity details table 510) an attribute value, HospitalNo1 by use of the appropriate byte I₅ in the expression.

In the entity history table 520, various details of the event being recorded may not be available, or may have no relevance at that time. For example, a new patient in a designated hospital may be admitted, and some details put on record, but the patient is not assigned to any particular doctor or ward until a later time. Additionally, some information may be recorded which is completely independent of the user view or other context information. Thus the event is logged with only relevant bytes of the expression encoded. Bytes for which the information is not known, or which are irrelevant to the event are non-deterministic and are filled with the wild card character, “#”.
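As an illustrative sketch only, an event expression with partially known bytes might be assembled as follows in Python. The function name and the particular codes used are hypothetical; only the fifteen-character length and the “#” wild card are taken from the description.

```python
from typing import Dict

EXPR_LEN = 15
WILD = "#"   # non-deterministic byte

def encode_event(known: Dict[int, str]) -> str:
    """Build an expression from the known bytes (keyed by 1-based
    position); every other byte is filled with the wild card."""
    chars = [WILD] * EXPR_LEN
    for position, char in known.items():
        if not 1 <= position <= EXPR_LEN:
            raise ValueError("position must lie between 1 and 15")
        if len(char) != 1:
            raise ValueError("each byte holds exactly one character")
        chars[position - 1] = char
    return "".join(chars)

# A newly admitted patient: organisation (I1), hospital (I5) and entity
# type (I6) are known; doctor and ward are not yet assigned.
admission = encode_event({1: "N", 5: "H", 6: "P"})
```

When the patient is later assigned to a doctor or ward, a further event would be logged with the corresponding bytes made deterministic.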

The entity history table 520 may also include an event tag field 528 which can be used in conjunction with a corresponding field in an episode management table to be described hereinafter. It will indicate which coding activity was being carried out when the expression was assigned to the entity. For example, this tag could indicate whether the coding was carried out during an initial assessment, an update, a correction, a re-assessment, etc. This tag also orders entity codes into event groups. For example, in the medical context, when a person enters the system as a patient, they initiate an episode. An episode can have many spells, and a spell can consist of many events. What is more, a patient can be undergoing more than one episode at a time, and under each episode, more than one spell at a time. Many organisations need to store this sort of information for costing and auditing purposes. By coding this information into an expression, it will be possible to browse this information.

The entity history table may also include a link field 529 which is designated to link related groups of codes allocated during a particular entity-event-time. For example, in a social services application, a home visit, a visit date, miles travelled and the visitor could all have an expression associated with the visit event. The link field will link these expressions together. Alternatively, the event tag field may also cater for this function.

A memo field 523 may also be included in the entity history table to allow the user to enter a free text memorandum of any length for each code allocated to an entity. In effect, every time a field is filled, a memo can be added.

The expression set of the entire database is recorded in a third table, the expression set table 530. This encodes each expression against its natural language meaning, and effectively records the data model as defined by the hierarchical structure of FIG. 1. There is a natural language meaning for each byte of the expression, each byte representing a node position in the data model tree, and the precise significance of every occurrence of every entity or attribute is provided by concatenating all natural language meanings for each byte of the expression: eg. NHS—Presentation Data Type—Administrator's View—Region1—HospitalNo2—Doctor Record—Name—DoctorID1.
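The concatenation of natural language meanings may be sketched in Python as follows. This is an illustration under stated assumptions: the table is modelled as a dictionary keyed by the path of characters leading to each node, the single-character codes are invented for the example, and " / " is used as an arbitrary separator.

```python
from typing import Dict

# Hypothetical fragment of an expression set table: node path -> term.
TERMS: Dict[str, str] = {
    "N": "NHS",
    "NP": "Presentation Data Type",
    "NPA": "Administrator's View",
    "NPA1": "Region1",
    "NPA12": "HospitalNo2",
}

def describe(expr: str, terms: Dict[str, str], sep: str = " / ") -> str:
    """Concatenate the term for each successive node on the path from
    the root, stopping at the first node with no entry."""
    parts = []
    for depth in range(1, len(expr) + 1):
        term = terms.get(expr[:depth])
        if term is None:
            break
        parts.append(term)
    return sep.join(parts)

meaning = describe("NPA12", TERMS)
```

Because each byte is resolved in the context of the bytes above it, the same low order character can carry different meanings in different branches of the tree.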

As has been discussed previously, the expressions may include expression extensions which map a sub-tree onto the main tree. For convenience, these extension expressions can be located within the expression set table 530 (the extension entries being identified by the byte I₁), or could be located in a supplementary table (not shown), in which the pointer fields I₁₁ to I₁₅ of the main expression are used as the first fields I₁₁ to I₁₅ of the extension expression.

The entity history table 520 and the expression set table 530 may each include an extra field holding a version code. In the entity history table, this would indicate a version number of the expression in use at the time the record was created; in the expression set table, expressions may be varied over time according to the version code given. This allows the structure of the hierarchy to change over time without necessarily introducing new expressions. This assists in maintaining backward compatibility of recorded data.

Further details of the tables and their structures will be discussed hereinafter. In use, the database management system first constructs the data model tree structure in the expression set table 530, with each expression being allocated a corresponding natural language term. This can be done by dialogue with the user, or by systems analysis by an expert. Preferably, pre-formatted codes representing certain data structures are used by many different users. For example, personnel file type structures may be used by many different organisations. This allows compatibility of databases to allow data sharing between organisations, with users being allocated blocks of codes for their own user-specific purposes, as well as using shared codes which have already been defined by a higher authority.

In FIG. 6 a, an exemplary expression set table portion 600 representing a personal details sub-tree is shown. It will be observed that fields I₈ and I₉ represent the personal detail sub-tree data structure which can be replicated for any part of the tree. That is to say, the sub-tree 601 can represent attributes of a patient (as shown) or in a different part of the tree may represent attributes of a prisoner, or member of staff. Note that the “names” grouping 602 (I₈=“1”) provides a sub-tree of entity attributes, eg. “surname”, “first name” etc., each of which will have a number of entity occurrences associated therewith, each having specific values. Each occurrence will be separately identified using the lower order fields of the expression (not shown). The actual values will be installed as character strings in the entity details table 510 (see FIG. 5). By contrast, the country of origin entity 603 (I₈=“5”) provides a sub-tree of discrete entity attribute values: eg. “England”, “Scotland”, “Wales”, “Belgium”, “France” etc. Thus, in this case, the tree structure (ie. the expression itself) can provide the individual attribute values of the entity “Country of Origin”.

In FIG. 6 b, a further exemplary expression set table portion indicates a sub-tree relating to diagnoses on a patient. Only the expression values I₁₁ to I₁₅ are shown for brevity. This expression sub-set provides a sub-tree of possible attribute values relating to diagnoses or operations etc. As mentioned above, these attribute values for diagnoses might correspond to industry or professional standard classifications such as ICD 10 or DSM IV.

In FIG. 6 c we show an expression set table portion representing a sub-tree which can be used to provide a “standard” range of discrete attribute values relating to angles between 0 and 180 degrees. In FIG. 6 d we show an expression set table portion 630 representing categories of medical presentations.

In other contexts of use, these expression subsets can cover qualifier types such as “weight”, “length”, “colour”, “temperature” etc, each with possible respective scales of uses, such as human weights in kg, car travel distances in miles, colour in Pantone code, temperature in K scale, respectively.

FIGS. 6 a-6 c demonstrate a further possible embodiment of the database implementation of the present invention. In FIG. 6 a it is noted that bytes I₁₁ to I₁₅ are not shown. In practice, there can be some advantages in operation in constructing separate tables to contain discrete “chunks” of the expression set table 530, that is chunks relating to adjacent groups of I₁₁ to I₁₅ codes which all relate to the same I₁ to I₁₀ value. These each form an extension table 540 which is pointed to by an extension table pointer located in column 606 of the expression set table 600. This is particularly useful where repeating chunks of I₁₁ to I₁₅ codes are found in many places throughout the expression set table 530. Thus sub-tables 610, 620, 630 could be tables in their own right, not forming part of the main table, and can thus be used at numerous locations down the main table 530.

According to a preferred embodiment of the present invention, the use of separate sub-tables such as those in FIGS. 6 b and 6 c enables different user views of the same basic data to be readily accommodated.

For example, with reference to FIGS. 7 a and 7 b, different users may require different views of the same or similar data. In FIG. 7 a, an expression set table (or table portion 701, as shown) lists relevant fields of an expression set table 530 corresponding to fields I₁₁ to I₁₅. In this example, however, the expression set uses a number of numeric codes rather than the single character alphanumeric codes represented in FIGS. 6 a to 6 d. In addition, the numeric code “−1” is used instead of the wild card character “#”. The corresponding natural language terms 535 are now found in a clinician's terms table 702 linked to the expression set table 701 by a linking index 703 common to each table. The clinician's terms table 702 provides a series of corresponding natural language terms, or more generally, data definitions, as applicable to and understood by a clinician.

In FIG. 7 b, the same expression set table portion 701 with linking index 703 is now shown together with corresponding natural language terms 535 in a patient's terms table 704 which provides a series of corresponding natural language terms as might be applicable to, and understood by a patient.

It will be understood that the natural language terms table 702 or 704 that is relevant in any particular situation can be determined according to one or more higher level fields I₁ to I₁₀ (not shown) in the expression. As will also be explained later, the determination of which sub-table is relevant may also, or alternatively, be made with reference to a user identity.

When taken in combination with natural language expressions deriving from the higher level fields I₁ to I₁₀ (not illustrated in FIG. 7 a or FIG. 7 b), a context of use provides the full “profile view” as indicated by the headings and sub-headings in table 710.

The illustrations of FIGS. 7 a and 7 b relate to what is described as a general view, ie. natural language expressions are provided to a predetermined degree of contextual specialisation. More specialist views are illustrated in FIGS. 8 a and 8 b where a more detailed contextual level is provided. For example, the psychological health context in FIGS. 7 a and 7 b is broken down further into mental health and behaviour categories only. In FIGS. 8 a and 8 b, the clinician's views and the patient's views of psychological health context are sub-divided into much more detailed categories as shown in table 810 (mental health—thought—thought content—somatic preoccupations; hallucinations; anxiety; behaviour etc). This additional contextual information is provided by virtue of a greater degree of resolution in the expression sets in table 801. Here it will be noted that many expression set fields that were previously non-deterministic (“−1”) are now specified precisely, thereby providing the higher degree of contextual specification.

It will be understood that corresponding clinician and patient view terms tables 702 and 704, 802, 804 need not have a one-to-one correspondence between corresponding data definitions. For example, where one user view requires a different level of granularity of information content, broader qualitative or quantitative data definitions may be found in the table 704 than in the table 702.

In FIGS. 6 a-6 d, the expression set table uses a standard notation. Because of the hierarchical nature of the expressions, it is essential to maintain positional integrity. Thus, with reference to FIG. 6 a, the patient sub-tree must commence at level I₈ of the tree structure, regardless of the complexity of the tree structure above. Thus, if there is only one organisation using the database, or if there is a limited number of user views required, there may be no requirement to use some higher order context specifiers (eg. I₄ and I₅). These unused fields have no specification at that point and are represented by “#”. Where there is no specification this is represented in the natural language term field 605 by the symbols “<>”. It will be understood that these particular choices of special characters are entirely arbitrary. In practice, each “character” of the expression set may be encoded by a two-byte binary word, for which specific values may hold special meaning.

In constructing the table, for implementational reasons discussed later, it is highly desirable that the table is maintained in strict alphanumeric order of expressions, with discontinuities between higher and lower tree branches filled in with blank specification lines (ie. those represented by “<>”). It will be understood that these correspond to particular levels within the tree structure for which there are no divisions of branches.

Additional fields may be included in the expression set table. For example, a note flag field 532 may be used to signify that explanatory information is available for a term. This would typically provide a pointer to a notes table. A symbol in this field could indicate the existence of, for example, passive notes (information available on request); advisory notes (displayed when the code is used); and selection notes (displayed to the user instead of the natural language term). A sub-set field 533 may also be provided for expression maintenance tasks, but these are not discussed further here.

When an expression set table has been constructed, it can be related to individual entity occurrences in the following manner. As previously discussed, the unique occurrences of entities can be placed in the entity details table 510, each having a unique identifier 512. This is linked to the expression set table, and thus to the tree, via the entity history table. This records the entity unique identifier 512 in a column 522 and links this with the appropriate expression or part expression 526. The date of the event is logged in field 524, and other details may be provided, eg. whether the data entry is a first registration of a record, whether it is a response record (eg. updating the database) etc.

Other tables may be used beyond those described in connection with FIG. 5, or the tables structured differently. In one embodiment, the expression set in table 530 is used to identify entities and attributes of entities, together with individual occurrences of entities that do not change over time. Details of occurrences of entities that are transient to the data model may be recorded in a separate table, such as the entity history table 520. Such transient objects may be, for example, individual personnel whose existence in the data model is impermanent or whose function (place) within the data model may change over time (eg. by promotion of staff or transfer within the organisation). In this instance, the unique identifier 522 and date/time field 524 relative to the expression field 526 indicate the function of that entity occurrence at that time.

The entity ID table 550 (FIG. 5) is an example of a secondary table which is used when communicating and sharing data with other systems. This table matches the entity unique identifier ID codes with entity ID codes used by other systems.

It is also possible to record static entity details in a form which is structured ready for input and output. For example, name, address and telephone records may be stored in successive columns of an address table 560, each record cross-referenced to the main data structure by the expression code or cross-referenced to an entity by the expression code I₁ to I₁₅. The link can thus be made with either the expression set table 530 or the entity history table 520. Then, whenever that branch of the tree is accessed pertaining to one individual record, the full static and demographic details of that entity occurrence may be accessed from a single table.

A similar arrangement is shown for providing detailed drug information, by drug table 570.

A further modification may be made to the embodiments described above in respect of the use of the entity details table 510. It is not essential for all information about an entity occurrence to reside in the entity details table 510. In some models, it is advantageous to restrict the use of the entity details table 510 to that of a “major entity” only, that is, the most significant entity forming part of the modelled organisation. For example, in the hospital environment, the patient could be chosen as the major entity. In this case, all other (non-structural, character-string) information about entities can be located in an appropriate field of either the entity history table 520, or the expression set table 530. In the case of the entity history table 520, an appropriate field to use is the memo field 523, and in the case of the expression set table 530, an appropriate field to use is the natural language term field 535. It will thus be understood that, where the non-structural information held about even the major entity is small, the entity details table 510 can be dispensed with altogether.

Reporting

The present invention offers significant advantages in the execution of reporting and database querying functions, particularly for multiple users or multiple classes of users.

To answer a given query, the database system defines a query expression comprising fifteen bytes which correspond with the expressions as stored in the entity history table 520 and expression set table 530. The query expression will include a number of deterministic bytes and a number of non-deterministic bytes. The non-deterministic bytes are effectively defined as the wild-card character “#”, meaning “matches anything”. The deterministic bytes are defined by the query parameters.

For example, a simple query might be: “How many patients are presently registered at hospital X?” To answer this query, the query expression imposes deterministic characters in fields I₁ (=NHS), I₄ (=hospital identity), I₆ (=patients). Other context information may be imposed by placing deterministic characters in bytes I₂ (=presentation information). All other bytes are non-deterministic and are set to “#”. The database scans through the expression set table matching the deterministic characters and ignoring others. Note that in the preferred embodiment, the expression set table is maintained in strict alphanumeric sequence and thus very rapid homing in on the correct portions of the database table is provided where high-order bytes are specified. This will normally be the case, since the hierarchical nature of the expression set will be arranged to reflect the needs of the organisation using it. The database system can then readily identify all the tuples of the expression set table providing a match to the query expression.
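A minimal Python sketch of this matching rule is given below, for illustration only. It assumes fifteen-character string expressions, and the single-character codes (for example “H” for hospital X in I₄) are invented for the example.

```python
WILD = "#"   # non-deterministic byte: matches anything

def matches(query: str, expression: str) -> bool:
    """True if every deterministic byte of the query equals the byte
    in the same position of the stored expression."""
    return len(query) == len(expression) and all(
        q == WILD or q == e for q, e in zip(query, expression)
    )

# I1 = "N" (NHS), I2 = "P" (presentation), I4 = "H" (hospital X),
# I6 = "P" (patients); all other bytes are non-deterministic.
query = "NP#H#P" + WILD * 9

table = [
    "NPAHBP" + "A" * 9,   # patient at hospital X
    "NPAHBD" + "A" * 9,   # different entity type in I6
    "NPAGBP" + "A" * 9,   # different hospital in I4
    "NPCHZP" + "B" * 9,   # patient at hospital X, other branch
]

count = sum(matches(query, row) for row in table)
```

The comparison is purely positional, so adding a further condition to the query amounts to replacing one “#” by a deterministic character.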

A significant advantage of the database structure will now become evident. The answer to the initial query has effectively homed in on one or more discrete portions of the expression set table and counted the number of tuples matching the query expression. Supposing that the user now requires to “progressively browse” by stipulating additional conditions: “How many of those patients are being prescribed drug Y” requires only the substitution of the non-deterministic character “#” with the appropriate character in the requisite field Iₙ of the expression to change the result. Similarly, carrying out statistical analysis of other parameters, such as: “How many patients were treated by doctor Z with drug Y” can rapidly be assessed. It will be understood that progressively narrowing the query will eventually result in all bytes of the query expression becoming deterministic and yielding no match, or yielding a single patient entity match whose details can then be determined by reference to the entity details table 510 (or the appropriate memo field).

The key to the speed of result of the statistical querying function is the construction of the expression set table. When imposing conditions on various attributes of an entity, ie. by setting a deterministic character in a byte of the query expression, the relevant data will be found in portions of the table in blocks corresponding to that character. Progressive sub-querying requires only scanning portions of the table already identified by the previous query. Even where a higher level context switch takes place, relevant parts of the expression set table can be accessed rapidly as they appear in blocks which are sequenced by the expression hierarchy.

Scanning the table can be achieved most efficiently by recognising that only the highest order deterministic byte of the query expression need be compared with corresponding bytes of each record in the expression set table until a first match is obtained. Thereafter, the next highest order byte must be included, and so on until all deterministic bytes are compared. This results from maintaining a strict alphanumeric ordering to the table.
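The benefit of the strict ordering can be illustrated with the following Python sketch. It is not the byte-by-byte scan described above: as a swapped-in technique it uses a binary search (the standard `bisect` module) on the leading run of deterministic bytes to locate the start of the relevant block, then scans only that block. The table is assumed to be a list of strings sorted in Python's default string order, and all codes are hypothetical.

```python
from bisect import bisect_left
from typing import List

WILD = "#"

def matches(query: str, expression: str) -> bool:
    """Positional match; wild card bytes of the query match anything."""
    return len(query) == len(expression) and all(
        q == WILD or q == e for q, e in zip(query, expression)
    )

def scan_sorted(table: List[str], query: str) -> List[str]:
    """Return matching expressions from a sorted table, visiting only
    the block that shares the query's leading deterministic bytes."""
    prefix = query.split(WILD, 1)[0]     # high order deterministic run
    i = bisect_left(table, prefix)       # first row not below the prefix
    found = []
    while i < len(table) and table[i].startswith(prefix):
        if matches(query, table[i]):
            found.append(table[i])
        i += 1
    return found

table = sorted([
    "APAHBP" + "A" * 9,
    "NPAHBP" + "A" * 9,
    "NPAHBD" + "A" * 9,
    "NPAGBP" + "A" * 9,
    "NPCHZP" + "B" * 9,
])
hits = scan_sorted(table, "NP#H#P" + WILD * 9)
```

If the query begins with a wild card the prefix is empty and the whole table is scanned, which reflects the observation that rapid homing in depends on the high order bytes being specified.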

A second type of querying relates to examining the historical aspects of the database. For example, the query may be, “In the last year, what drugs and quantities have been prescribed by doctor X?” To answer this query, the query expression is formulated in the same manner as before, imposing deterministic bytes in the appropriate places in the query expression. This will include one or more “lowest order” bytes in I₁₁ to I₁₅ which actually identify a doctor, and non-deterministic characters against the drug fields. This time, however, the entity history table 520 is scanned, in similar manner, seeking only matches of deterministic characters. In a preferred embodiment, the entity history table will be maintained in chronological sequence and thus the search can be limited to a portion of the table where date limitations are known and relevant. Matches of deterministic characters will be found throughout the table where a relevant event relating to prescription of a drug by doctor X is found. Note that the entity history table may include other fields which can be used to impose conditions on the query, such as the user ID of the person entering the record.
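A historical query of this kind might be sketched as follows in Python, purely by way of example. The record layout (`Event`), the field names and the codes are assumptions made for the sketch: here bytes I₁₁ and I₁₂ are taken to identify the doctor and byte I₁₃ the drug. The history list is assumed to be held in chronological order.

```python
from bisect import bisect_left
from dataclasses import dataclass
from datetime import date
from typing import List

WILD = "#"

@dataclass
class Event:
    when: date          # date field 524
    entity_id: str      # unique identifier 522
    expression: str     # expression 526

def matches(query: str, expression: str) -> bool:
    return len(query) == len(expression) and all(
        q == WILD or q == e for q, e in zip(query, expression)
    )

def history_query(history: List[Event], query: str, since: date) -> List[Event]:
    """Match only events on or after `since` in a chronological table."""
    dates = [e.when for e in history]    # built here for simplicity
    start = bisect_left(dates, since)    # skip the older portion
    return [e for e in history[start:] if matches(query, e.expression)]

history = [
    Event(date(2023, 3, 1), "P1", "NPAHBPAAAA" + "D7Q##"),
    Event(date(2024, 2, 10), "P2", "NPAHBPAAAA" + "D7R##"),
    Event(date(2024, 3, 15), "P3", "NPAHBPAAAA" + "D9R##"),
    Event(date(2024, 6, 20), "P1", "NPAHBPAAAA" + "D7S##"),
]
# Doctor "D7" in I11-I12; the drug byte I13 is left non-deterministic.
found = history_query(history, WILD * 10 + "D7###", date(2024, 1, 1))
drugs = [e.expression[12] for e in found]
```

The drug codes recovered from byte I₁₃ would then be translated into natural language terms through the expression set table.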

A third type of querying relates to analysis of the records pertaining to a single entity value: the entire medical record of patient X. In the preferred embodiment, patient X would be identifiable from the entity details table 510. The query would initially involve searching for the patient's name to locate the unique identifier (unless that was already known). Once the unique identifier for a patient was known, then the entire entity history table can be scanned very rapidly for any entry including the unique identifier. The strengths of the present invention will then be realized in that the output from this scan will provide a number of entries each of which carries all of the relevant information about that patient incorporated into the extracted expression bytes I₁ to I₁₅. The entire patient's record can then be “progressively browsed” without recourse to any further searching operation on the main entity history table. Specific details of the patient's treatments, doctors, hospital admissions, prescriptions etc are all very rapidly available at will by assertion of appropriate deterministic bytes in the expression I₁ to I₁₅.

It is noted that the entity history table will include many records where the expression stored in the record contains many non-deterministic bytes. For example, where a doctor X prescribes a patient Y with drug Z, other bytes of the expression may be either not known, or not relevant. For example, the patient may have been assigned to a ward W in the hospital which could be identified by another byte. However, this venue in which the treatment took place might be: a) unknown; b) known but not relevant to the record; or c) automatically inferrable from the context of the person making the record entry. Whether this information is included in the record is stipulated by the users; however, it will be noted that it does not affect the result of the query whether the byte in the entity history table relating to ward W is deterministic or non-deterministic, because the query expression will set that relevant byte to non-deterministic unless it is stipulated as part of the query.

When the database system has extracted all of the records of the entity history table matching the query expression, it preferably saves these to a results table for further querying, or progressive browsing. For example, the results table can then be analysed to identify which treatments were made at an individual hospital, or by an individual doctor by setting additional conditions on particular bytes of the query expression. Memo fields can be extracted to view comments made at the time of treatment. It can be seen that the results table formed in response to the initial query actually contains all of the information relevant to a given patient's treatment, and not just the answer to the initial query “What drugs have been prescribed to patient X?”
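Progressive browsing of a results table can be sketched as below in Python. The sketch is illustrative only: the results table is modelled as a list of expressions already extracted by an initial query, byte I₇ is assumed (hypothetically) to carry a drug code, and the function names are invented.

```python
from typing import List

WILD = "#"

def matches(query: str, expression: str) -> bool:
    return len(query) == len(expression) and all(
        q == WILD or q == e for q, e in zip(query, expression)
    )

def narrow(query: str, position: int, char: str) -> str:
    """Make one non-deterministic byte (1-based position) deterministic."""
    if query[position - 1] != WILD:
        raise ValueError("byte is already deterministic")
    return query[:position - 1] + char + query[position:]

def refine(results: List[str], query: str) -> List[str]:
    """Re-filter the saved results only; the main table is not rescanned."""
    return [expr for expr in results if matches(query, expr)]

initial_query = "NP#H#P" + WILD * 9
results = [                      # rows saved from the initial query
    "NPAHBPY" + "A" * 8,
    "NPAHBPZ" + "A" * 8,
    "NPCHZPY" + "B" * 8,
]
drug_y_query = narrow(initial_query, 7, "Y")
drug_y = refine(results, drug_y_query)
```

Each further condition operates on a progressively smaller list, which is the source of the speed advantage claimed for sub-queries.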

In summary, the information of the database is stored in such a manner that data for a query may be extracted far more rapidly than with relational database storage schemas, and with an expression for each extracted record. The presence of this expression in the query result has an important effect. A unique reporting benefit gained is the scope for progressive browsing and “interactive reporting”. When a database query is executed to provide information for a report, the answer will be made up of a number of expression records. This subset of expressions inherits all the structural information held in the main expression set.

As a general example: a detailed report on the number of severe hallucination instances in a given geographical area during the past year might return a subset of 12,000 expressions. Because these are full expressions, higher and lower level information is also inherent in this subset. Further investigation of the answer through browsing the returned hierarchy might reveal that 70% of cases were male, or 30% of cases occurred in the prison service, etc. Similarly, a high level report on the number of instances of hallucination in a particular organisation might return a subset of 9,000. More detailed information will be inherent in this retrieved subset. By progressive browsing of this subset, it may transpire that 90% of mild occurrences were in planning departments or that 5% of severe occurrences were in education departments. The processing time required to browse this information with further, more detailed, “sub-queries” is substantially speeded up over prior art systems simply because the expression set readily provides all the lower level information.
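A percentage breakdown of a returned subset by any one byte can be computed directly from the expressions, as the following Python sketch suggests. The choice of byte I₇ for sex and the codes “M” and “F” are hypothetical, as is the function name.

```python
from collections import Counter
from typing import Dict, List

def breakdown(results: List[str], position: int) -> Dict[str, float]:
    """Percentage of returned expressions carrying each character at
    the given 1-based position."""
    counts = Counter(expr[position - 1] for expr in results)
    total = len(results)
    return {code: 100.0 * n / total for code, n in counts.items()}

subset = [
    "NPAHBPM" + "A" * 8,
    "NPAHBPM" + "B" * 8,
    "NPAHBPF" + "C" * 8,
    "NPCHZPM" + "D" * 8,
]
by_sex = breakdown(subset, 7)
```

No further access to the main tables is needed, since each returned expression already carries the higher and lower level information.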

With reference to FIG. 9, there is described a profile processor that particularly facilitates the input and output of queries and data according to specific requirements of a variety of users. The profile processor is adapted to allow different views or profiles of the data stored in the database according to the individual user, or class of user. The profile processor is particularly suited in its specific functionality for hardware implementation using programmable electronic gate circuitry (eg. uncommitted logic arrays or ASICs) and dedicated volatile and non-volatile memory.

In the present invention, it has been recognised that the expression I₁ to I₁₅ encoded in the expression set table 530 and in the entity history table 520 can be used not only for matching against a query expression comprising a selection of deterministic and non-deterministic characters, but also for deploying a set of profile expressions, also each comprising a selection of deterministic and non-deterministic characters, that can be used to control the output and display of search results according to the individual user.

The profile processor 901 effectively acts as a filtration stage in conjunction with a query processor 902. A user input 903 provides a query expression 904 comprising a selection of deterministic characters and non-deterministic characters “#”. As previously explained, records will be extracted from the entity history table 520 by the query processor 902 whenever every deterministic character in the query expression 904 matches a corresponding deterministic character in the expression field 526 of the entity history table 520. Extracted records will be passed through to the profile processor stage 901. The profile processor 901 obtains a series of user profile expressions 905 from a user profiles database 906, according to the identity of a user logged into the system, or according to the class of user logged into the system. Each of these user profile expressions 905 comprises a set of deterministic characters and non-deterministic characters. The user profile expressions define deterministic fields which the expressions of the records extracted by the query processor must match in order for a record to be passed through to the display. In the preferred embodiment, the set of user profile expressions 905 filter the extracted records on a Boolean OR basis, ie. for each extracted record there must be a match with at least one of the user profile expressions. It will be understood, however, that an alternative record filtration basis would be to filter the extracted records on a Boolean AND NOT basis, ie. for each extracted record, there must be no deterministic character matches with any user profile expression. In this case, the user profile expressions would define areas of the database to be excluded.
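The two filtration bases may be sketched in Python as follows. This is a software illustration only, not the hardware implementation contemplated above. The AND NOT basis is interpreted here as excluding any record that matches at least one profile expression, which is one reading of the passage; the profile expressions and codes are hypothetical.

```python
from typing import List

WILD = "#"

def matches(pattern: str, expression: str) -> bool:
    return len(pattern) == len(expression) and all(
        p == WILD or p == e for p, e in zip(pattern, expression)
    )

def filter_or(records: List[str], profiles: List[str]) -> List[str]:
    """Boolean OR basis: keep a record matching at least one profile."""
    return [r for r in records if any(matches(p, r) for p in profiles)]

def filter_and_not(records: List[str], profiles: List[str]) -> List[str]:
    """Boolean AND NOT basis (as interpreted here): keep a record only
    if it matches none of the profiles, which mark excluded areas."""
    return [r for r in records if not any(matches(p, r) for p in profiles)]

extracted = [                      # records passed on by the query stage
    "NPAHBP" + "A" * 9,
    "NPAGBP" + "A" * 9,
    "NPAKBP" + "A" * 9,
]
profiles = ["NP#H" + WILD * 11, "NP#G" + WILD * 11]

visible = filter_or(extracted, profiles)
remaining = filter_and_not(extracted, profiles)
```

The same positional comparison serves both the query processor and the profile processor, so the filtration stage adds no new matching logic.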

The user profile database 906 may also provide an indication of which expression set table 530 and/or sub-tables 610, 620 should be used for a specific user profile to generate the data description or natural language term corresponding to the extracted record. For example, within the expression set table 530, or linked thereto, a plurality of distinct sub-tables 701, 702 or 801, 802 (as described with reference to FIGS. 7 and 8) may be provided. Each sub-table may be specific to a particular user or group of users. As the profile processor 901 filters the extracted records retrieved from the entity history table by the query processor, the expression 526 from any records that match both the query expression 904 and the user profiles 905 is used to extract the data description or natural language term from the expression set table according to the sub-table 701, 702, 801, 802 that is prescribed by the user profile provided by user profile database 906.

In use, the database system would be established to give each user a profile in the user profile database. The user profile would include the set of profile expressions 905 used to filter data retrieved by the query processor. The user profile would also include a set of sub-table pointers, each pointer corresponding to one expression in the expression set table 530 and indicating which sub-table 701, 702, 801, 802 should be used when matching the retrieved expression. The user profile database can also be used to specify the layout and structure of the display presented to any particular user or class of users.
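A minimal sketch of such a user profile, assuming a simple in-memory representation, is given below. The dictionary layout, the field names, the sample expression and the two natural language terms are hypothetical; only the reference numerals 701 and 702 are borrowed from the description, as sub-table identifiers.

```python
# Hypothetical expression set table: each expression maps to one term per sub-table.
EXPRESSION_SET = {
    "PATA01BP0000000": {701: "hypertension", 702: "high blood pressure"},
}

# Hypothetical user profile database, keyed by class of user.
USER_PROFILES = {
    "doctor": {
        "profile_expressions": ["PAT" + "#" * 12],
        "sub_table_pointers": {"PATA01BP0000000": 701},
        "layout": "detailed",
    },
    "patient": {
        "profile_expressions": ["PAT" + "#" * 12],
        "sub_table_pointers": {"PATA01BP0000000": 702},
        "layout": "summary",
    },
}

def describe(expression: str, user_class: str) -> str:
    """Return the natural language term for `expression`, taken from the
    sub-table that the user's profile points to for that expression."""
    profile = USER_PROFILES[user_class]
    sub_table = profile["sub_table_pointers"][expression]
    return EXPRESSION_SET[expression][sub_table]
```

The same stored expression thus yields a different description for each class of user, while the underlying record is unchanged.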

As an illustration, there may be five general classes of views. A “discipline view” may be provided for each user discipline, such as “nurse”, “doctor”, “hospital administrator”, etc. These views will filter for different sets of data, according to the requirements of the discipline. Similarly, a “specialist view” may be provided for each sub-group of the disciplines, eg. the class “doctor” may have optional specialist views of “cardiac specialist”, “ENT specialist” etc in which different levels of detail of information are filtered by the profile processor. Another class of view, the “perspective view”, may present the same essential information, but use a different sub-table 610, 620 to provide the natural language terms. A perspective view for separate groups of persons, such as “doctor” and “patient”, can be provided so that each class of person can see the data presented in a comprehensible format.

Note that although the illustrative embodiment shows the query processor as the first record extraction stage from the entity history table, and the profile processor as the second stage, it will be understood that these two operations could be reversed, although this would be very much less efficient.

The query processor 902 may also be provided with capability for generating “event views”, in which the records are filtered according to date/time stamps in the entity history table. Similarly, “key views” comprising specific predetermined information (data types) regarding one entity occurrence (eg. a specific patient) may be provided. In these “key views”, the specific data types selected for view may be chosen on the basis of fixed data types for a given type of entity (eg. certain categories of biophysical data for a patient). Alternatively, the data types selected for display may be variable based upon data values for a given entity. For example, the key view for any patient may be based upon those data types for which the stored data value is scored above a predetermined critical value, or outside a predetermined critical range. This provides each entity occurrence with its own “key view” of a few items of importance, eg. key problems. In reporting, it becomes possible to select a patient and then very quickly extract for display the total population of entities that share those key problems so as to generate a real time empirical normative view against which to compare the single patient.
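The value-driven form of “key view” could be sketched as below, under the assumption that each entity occurrence has a set of latest data values and that a critical range is defined per data type. The data type names, the numeric ranges and the helper names are hypothetical and purely illustrative.

```python
# Hypothetical critical ranges per data type (illustrative numbers only).
CRITICAL_RANGES = {
    "systolic_bp": (90, 140),
    "heart_rate": (50, 100),
    "temperature": (36.0, 38.0),
}

def key_view(values, critical_ranges=CRITICAL_RANGES):
    """Return the data types whose stored value lies outside its critical range."""
    problems = []
    for data_type, value in values.items():
        if data_type not in critical_ranges:
            continue
        low, high = critical_ranges[data_type]
        if not (low <= value <= high):
            problems.append(data_type)
    return sorted(problems)

def sharing_key_problems(population, problems, critical_ranges=CRITICAL_RANGES):
    """Return the entity occurrences whose own key view includes all `problems`."""
    wanted = set(problems)
    return sorted(
        entity for entity, values in population.items()
        if wanted <= set(key_view(values, critical_ranges))
    )

population = {
    "A": {"systolic_bp": 165, "heart_rate": 72, "temperature": 38.6},
    "B": {"systolic_bp": 150, "heart_rate": 110, "temperature": 39.0},
    "C": {"systolic_bp": 120, "heart_rate": 70, "temperature": 37.0},
}
problems_of_a = key_view(population["A"])
peers = sharing_key_problems(population, problems_of_a)
```

Here the key problems of one selected patient are used to pick out the population sharing them, which is the comparison group referred to above.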

In the latter instance, the profile processor 901 may call upon user profile expressions 905 that identify specific quantitative data values indicated by the expression, eg. where an expression in the entity history table 520 indicates that a patient has a blood pressure value that is above recommended limits.

It will be understood that the entity history table 520 represents a log of events recorded over time against a plurality of entities, entity occurrences or attributes thereof. Each event could relate, for example, to a specific assessment, diagnosis or treatment event of a patient in a hospital. The particular type of event (eg. diagnosis of a specific condition; measurement of a quantitative physiological parameter such as blood pressure, heart rate, etc; qualitative assessment of a particular condition in the patient; or application of a specific treatment) is indicated according to the particular expression 526 logged in the table. The quantitative or qualitative value ascribed to the event may be entered as an information string, eg. in table portion 523, or may be encoded in the expression 526 itself, as explained above.

With reference to FIG. 10, it will be seen that an entity history relating to, say, one patient or collection of patients, may comprise a series of assessment events or items 1001 . . . 1010 which may be spread over a time period T_A. Each assessment item may be an answer, or response 1015, to an assessment question 1014. Thus, each item 1001 . . . 1010 within the time period T_A should generally be retrieved from the database when seeking to extract patient information for a particular assessment or treatment episode. This can readily be achieved in the present invention by setting the deterministic characters of the query expression 904 to those that correspond to the patient and assessment or treatment episode while leaving as non-deterministic those characters that relate to the individual items or events within the treatment episode. The data can readily be limited to the time period T_A simply by scanning only the small portion of the entity history table 520 covering that time period. It will be recalled that the entity history table is preferably maintained in chronological sequence.

More generally, however, and with reference to FIG. 11, certain assessment items (eg. 1001, 1004, 1006) may have multiple responses (eg. 1101, 1104, 1106) where data are gathered or tests carried out more than once. It will be clear that in any true appraisal of the data relating to a given assessment episode 1100, all records over the time period T_A should be taken into account, conventionally by a statistical process such as averaging. The output processor 910 preferably handles this process.
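As a sketch only, and assuming that averaging is the statistical process chosen, multiple responses to the same assessment item could be combined as follows. The item labels and values are hypothetical, and the function is not presented as the actual behaviour of the output processor 910.

```python
from collections import defaultdict
from statistics import mean

def aggregate(records):
    """Group (item, value) responses by item and average each group."""
    grouped = defaultdict(list)
    for item, value in records:
        grouped[item].append(value)
    return {item: mean(values) for item, values in grouped.items()}

# Hypothetical responses gathered within one assessment episode;
# items 1001 and 1004 were each measured twice.
responses = [
    ("item1001", 120.0),
    ("item1004", 70.0),
    ("item1001", 130.0),
    ("item1006", 37.0),
    ("item1004", 74.0),
]
averaged = aggregate(responses)
```

Any other statistical process (median, most recent value, and so on) could be substituted for `mean` without changing the grouping step.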

The structure of the data records in the database of the present invention enables very rapid data extraction over a predetermined time window, since event records in the entity history table are in chronological sequence. Still further, the time window over which records are extracted from the entity history table can also be specified not as a pair of time limits between which records should be extracted (eg. “T_min<T_extract<T_max”), but as a band around a specified target time (“T±ΔT”), referred to as a “width of now”. This essentially defines the granularity of the data extraction required.

With reference to FIG. 12, the effect on data extraction of providing an ability to specify ΔT at will is readily apparent. Specifying a target time T_N1 and “width of now” value ΔT₁ ensures that all the data from the relevant assessment episode 1201 is captured. Similarly, specifying a target time T_N2 and “width of now” value ΔT₂ ensures that all the data from two adjacent assessment episodes are captured. By simply doubling or tripling the value of ΔT₂, data from all four episodes 1201 . . . 1204 would be captured. Such a change in ΔT value in real time during querying would have a very minor effect on data extraction time, since the query expression 904 remains unaltered and the contiguous portion of the entity history table 520 being scanned is merely expanded slightly.

Thus, a user of the database may quickly re-specify ΔT during a query session to review the effects on average data, almost in real time.
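The “width of now” extraction can be sketched as a binary search over a chronologically ordered list of event times. This is an illustrative assumption about how the contiguous portion of the table might be located; the event times (plain numbers here), the labels and the function name are hypothetical.

```python
from bisect import bisect_left, bisect_right

def extract_window(times, records, target, delta):
    """Return the contiguous slice of `records` whose event time lies in
    [target - delta, target + delta]. `times` must be sorted ascending and
    aligned index-for-index with `records`."""
    lo = bisect_left(times, target - delta)
    hi = bisect_right(times, target + delta)
    return records[lo:hi]

# Hypothetical event times, forming four episodes around 2, 11, 20 and 30.
times = [1, 2, 3, 10, 11, 12, 20, 21, 30, 31]
records = [f"event{i}" for i in range(len(times))]

one_episode = extract_window(times, records, target=11, delta=1)
wider = extract_window(times, records, target=11, delta=10)
```

Changing `delta` only moves the two slice boundaries, which is why re-specifying ΔT during a session is inexpensive in this sketch.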

Still further, ΔT may be automatically specified (or provided with a default value) according to the query expression 904. In other words, there may be provided a series of default values for ΔT for different types of records that automatically ensure that data is extracted to an appropriate degree of granularity. ΔT values might be inferred from the query expression or possibly from the user profile.
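One way of deriving a default ΔT from the query expression, offered only as an assumption, is a table of patterns tried in order, with a fallback value. The patterns, the ΔT values (taken here to be in hours) and the names below are all hypothetical.

```python
WILDCARD = "#"

# Hypothetical defaults: (pattern, default delta in hours), first match wins.
DEFAULT_DELTAS = [
    (WILDCARD * 6 + "BP" + WILDCARD * 7, 4.0),
    (WILDCARD * 6 + "HR" + WILDCARD * 7, 1.0),
]
FALLBACK_DELTA = 24.0

def default_delta(query: str) -> float:
    """Return the default width of now for `query`: the value attached to
    the first pattern whose deterministic characters all appear in the query."""
    for pattern, delta in DEFAULT_DELTAS:
        if len(pattern) == len(query) and all(
            p == WILDCARD or p == q for p, q in zip(pattern, query)
        ):
            return delta
    return FALLBACK_DELTA
```

A user profile could supply its own version of `DEFAULT_DELTAS`, which would correspond to inferring ΔT from the profile rather than from the query alone.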

It will be understood that while the example given relates to extracting data in respect of perhaps a single patient, a user gathering general data for a treatment plan over many patients (eg. several hospitals) need only modify the query expression 904 to change a deterministic character representing a specific entity occurrence to a non-deterministic character covering all entity occurrences for that particular treatment plan.

With reference to FIG. 13, an alternative table structure for the expression set table 530, 540 and entity history table 520 of FIG. 5 is shown. In this embodiment, the expression set table 1300 is divided into an item expression portion 1310 providing: in column 1311, the main part I₁ to I₁₀ of the expression, broadly relating to the hierarchical position of the entity in the data model; in column 1312, the expression qualifier data type indication characters I₁₁, I₁₂ (indicating, eg. that the entry relates to a measurement of length); and in column 1313, an index to a series of tables 1330 to 1360 providing information relating to that expression, such as the natural language term, numeric value, freeform notes or links to other attributes.

The entity history table 1350 is also divided into a “main” entity history table 1360 and a “transient” entity history table 1370. The main entity history table 1360 is used for profiles of entities that are key to the context of the organisation being represented, and are also, generally speaking, permanent entities. In a health service context, these “main” entities would be patients. The “transient” entity history table 1370 carries the histories of other entities within the organisation, eg. staff, next of kin, locations, facilities etc. Each entity history table 1360, 1370 comprises a unique identifier field 1361, 1371; an expression field 1362+1364, 1372+1374; an event date/time field 1363, 1373 and a memo field 1365, 1375, in common with the embodiment described in connection with FIG. 5. However, in this presently preferred embodiment, the tables also include an event identifier field 1366, 1376 which records either an item instance or a qualifier instance. This is to distinguish between independent or successive events that relate to an attribute item or qualifier.

In practice, events which occur relating to an attribute of an item can be events that occur in parallel or in series. For example, an entity history table entry may indicate that at time T1 the attribute “colour” of entity1 was “RED”. At a later time, the entity history table may record that at time T2, the attribute “colour” of entity1 was “BLUE”. Either:

-   (i) the entity1 has two colours (scenario 1), or
-   (ii) the entity1 colour has changed over time (scenario 2), or
-   (iii) the entity1 actually comprises two discrete items (scenario 3).

These different scenarios are recorded by distinguishing in the event identifier field 1366, 1376 the two entity history events as

-   (i) instance1, time1, item1 and instance1, time2, item1 (scenario 1), or
-   (ii) instance1, time1, item1 and instance2, time2, item1 (scenario 2), or
-   (iii) instance1, time1, item1 and instance2, time2, item2 (scenario 3).
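The three cases listed above can be told apart from the event identifiers alone, as the following sketch suggests. The `EventId` structure and the `scenario` function are hypothetical names introduced for illustration; the decision rule simply restates the three combinations given in the list.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EventId:
    instance: int   # instance number recorded in the event identifier field
    time: str       # event time label, eg. "T1"
    item: int       # item number

def scenario(first: EventId, second: EventId) -> int:
    """Classify two events on the same attribute into scenario 1, 2 or 3."""
    if first.item != second.item:
        return 3    # two discrete items
    if first.instance != second.instance:
        return 2    # same item, value changed over time
    return 1        # same item and instance: both values hold together

red = EventId(instance=1, time="T1", item=1)
```

For example, `scenario(red, EventId(2, "T2", 1))` corresponds to the colour of entity1 having changed from “RED” to “BLUE”.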

Also included in the entity history tables 1360, 1370 is an event type field 1367, 1377 that may be used for rapid retrieval of all similar events (eg. assessments, diagnoses, care records, registrations etc).

The present invention can be readily realized both in software and in hardware. It will be understood that the database querying essentially requires rapid fifteen-byte-wide comparison of the expressions I₁ to I₁₅. An extremely fast co-processor ASIC could thus be manufactured which includes up to fifteen eight-bit comparators in parallel. In practice, querying would never require all fifteen bytes to be compared, as most queries involve the setting of a large number of the bytes to a non-deterministic state, thus requiring fewer parallel circuits and enabling simplification of the design of a dedicated co-processor.
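A software analogue of the parallel byte comparison is a masked comparison, in which non-deterministic positions are masked out so that they take no part in the test. The sketch below assumes single-byte (ASCII) characters and fifteen-character expressions; it illustrates the principle only and is not a description of any particular co-processor design. The function names are hypothetical.

```python
WILDCARD = "#"

def compile_query(query: str):
    """Turn a query expression into (mask, value) integers.

    Each deterministic character contributes a mask byte of 0xFF and its own
    byte value; each non-deterministic character contributes 0x00 to both."""
    mask = bytes(0x00 if c == WILDCARD else 0xFF for c in query)
    value = bytes(0x00 if c == WILDCARD else ord(c) for c in query)
    return int.from_bytes(mask, "big"), int.from_bytes(value, "big")

def matches_masked(expression: str, mask: int, value: int) -> bool:
    """Compare all bytes at once: XOR exposes differing bits, and the mask
    discards differences at non-deterministic positions."""
    record = int.from_bytes(expression.encode("ascii"), "big")
    return (record ^ value) & mask == 0

mask, value = compile_query("PAT" + WILDCARD * 12)
```

The query is compiled once and then applied to every record scanned, so the per-record work is one XOR, one AND and one comparison with zero.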

1. A method of operating a database system comprising the steps of: assigning to each of a plurality of entities, attributes and entity occurrences, a unique, multi-character expression, the expression having a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence; storing said expressions in an expression set table linking each element of each expression with a data definition relating the expression to a hierarchical level and a position in a data model; recording events in an entity history table, each event having associated therewith a relevant expression from the expression set table; extracting records from the database according to a multi-character query expression comprising characters which are deterministic to the query and characters which are not deterministic to the query; filtering the extracted records according to a plurality of multi-character profile expressions each comprising characters that are deterministic and characters that are non-deterministic and which together define filtration criteria; outputting only extracted records that meet the filtration criteria and match the query expression.
2. The method of claim 1 wherein the extracting step comprises: scanning at least a selected portion of the entity history table to examine the expression contained in each record; matching every deterministic character of the query expression with every deterministic character in the examined record; and where each deterministic character of the query expression matches the respective record expression, extracting the record.
3. The method of claim 1 wherein the filtering step comprises: matching every deterministic character of each profile expression with every deterministic character in the extracted record and discarding said record unless each deterministic character of the extracted record matches each deterministic character of at least one profile expression.
4. The method of claim 1 further comprising the step of maintaining a user profile database storing, for each of a plurality of users or classes of user of the database system, a respective set of said multi-character profile expressions.
5. The method of claim 4 further comprising the steps of: storing a plurality of corresponding data definitions for each of a plurality of said multi-character expressions in said expression set table; and maintaining, in said user profile database, for each of said plurality of users or classes of user, an indication of which of said data definitions are to be associated with each multi-character expression.
6. The method of claim 5 wherein said plurality of corresponding data definitions are maintained in a plurality of sub-tables linked to the expression set table, and wherein the user profile database identifies at least one sub-table for each user or class of user.
7. The method of any one of claims 4 to 6 further comprising the step of maintaining, in said user profile database, for each of a plurality of users or classes of user, an output format for controlling the display of said outputted, extracted records.
8. A method of operating a database system comprising the steps of: assigning to each of a plurality of entities, attributes and entity occurrences, a unique, multi-character expression, the expression having a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence; storing said expressions in an expression set table linking each element of each expression with a data definition relating the expression to a hierarchical level and a position in a data model; recording events in an entity history table, each event having associated therewith a relevant expression from the expression set table; extracting records from the database according to a Boolean combination of (i) a multi-character query expression comprising characters which are deterministic to the query and characters which are not deterministic to the query and (ii) a plurality of multi-character profile expressions each profile expression comprising characters that are deterministic to predetermined filtration criteria and characters that are non-deterministic to predetermined filtration criteria, by selecting records in which every deterministic character of the Boolean combination matches a corresponding deterministic character in said expressions in the database; and outputting said extracted records.
9. Database apparatus comprising: means for storing, for each of a plurality of entities, attributes and entity occurrences a unique, multi-character expression, the expression having a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence; means for storing said expressions in an expression set table linking each element of each expression with a data definition relating the expression to a hierarchical level and a position in a data model; means for storing, in an entity history table, a plurality of recorded events, each event having associated therewith a relevant expression from the expression set table; a query processor for extracting records from the database according to a multi-character query expression comprising characters that are deterministic to the query and characters that are not deterministic to the query; a profile processor for filtering the extracted records according to a plurality of multi-character profile expressions each comprising characters that are deterministic and characters that are non-deterministic and which together define filtration criteria; and output means for generating an output of all extracted records that match the filtration criteria.
10. A method of operating a database system comprising the steps of: assigning to each of a plurality of entities, attributes and entity occurrences, a unique, multi-character expression, the expression having a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence; storing said expressions in an expression set table linking each element of each expression with a data definition relating the expression to a hierarchical level and a position in a data model; recording events in an entity history table, each event having an event time and a relevant expression from the expression set table associated therewith; and extracting records from the entity history table according to a multi-character query expression comprising characters that are deterministic to the query and characters that are not deterministic to the query, and according to a predetermined time window function.
11. The method of claim 10 wherein the predetermined time window function is a function of the multi-character query expression.
12. The method of claim 10 or claim 11 wherein the predetermined time window function is retrieved from a user profile database.
13. The method of claim 10 wherein said extracting step includes the steps of: receiving, from a user, said multi-character query expression and a specified time value or range of values for said record extraction; and determining a time band value for expanding said specified time value or range to automatically capture records a predetermined distance outside said specified time value or range of values.
14. The method of claim 13 further including the step of aggregating the captured records according to a predetermined statistical process.
15. Database apparatus comprising: means for storing, for each of a plurality of entities, attributes and entity occurrences, a unique, multi-character expression, the expression having a predetermined hierarchical structure which defines the relationship between each entity, attribute and entity occurrence; means for storing said expressions in an expression set table linking each element of each expression with a data definition relating the expression to a hierarchical level and a position in a data model; means for recording events in an entity history table, each event having an event time and a relevant expression from the expression set table associated therewith; and means for extracting records from the entity history table according to a multi-character query expression comprising characters that are deterministic to the query and characters that are not deterministic to the query, and according to a predetermined time window function.